[Hexagon] RzIL uplifiting #3837

Merged
merged 14 commits into from
Mar 22, 2024

Rot127
Member

@Rot127 Rot127 commented Sep 8, 2023

SQUASH ME

commit message: Uplift Hexagon architecture to RzIL

The general structure is that every (sub-)instruction has a getter for its RzIL code.
Calling the getter returns the RzIL operation.
Because Hexagon only executes whole instruction packets, the plugin decides what to return when the RzIL for an instruction is requested. If the instruction is not the last one in its packet, the plugin simply returns EMPTY(). If it is the last one, the plugin collects the RzIL operations of all instructions in the packet, shuffles them into the correct execution order (according to some rules), and returns the complete operation for the packet.
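
The packet decision described above can be sketched roughly as follows. This is only an illustration, not the plugin's code: `SketchInsn`, `SketchPacket` and `sketch_get_packet_il` are hypothetical names, IL operations are represented by plain strings, and the single ordering rule used here (jumps are placed last) is an assumption standing in for the real shuffling rules.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PKT_INSNS 4 /* a Hexagon packet holds up to four instructions */

/* Hypothetical stand-in for a decoded instruction and its IL operation. */
typedef struct {
	const char *il_op; /* placeholder for the RzIL operation of the instruction */
	int is_jump; /* non-zero if the operation changes control flow */
} SketchInsn;

typedef struct {
	SketchInsn insns[SKETCH_MAX_PKT_INSNS];
	size_t count; /* number of instructions in the packet */
} SketchPacket;

/*
 * Fills `out` with the IL operations to return for the instruction at `idx`
 * and returns how many were written. Zero models the EMPTY() case.
 */
size_t sketch_get_packet_il(const SketchPacket *pkt, size_t idx, const char *out[SKETCH_MAX_PKT_INSNS]) {
	assert(pkt && out && pkt->count > 0 && pkt->count <= SKETCH_MAX_PKT_INSNS && idx < pkt->count);
	if (idx != pkt->count - 1) {
		return 0; /* not the last instruction: nothing is executed yet */
	}
	size_t n = 0;
	/* Assumed ordering rule: non-jump operations first, in packet order. */
	for (size_t i = 0; i < pkt->count; i++) {
		if (!pkt->insns[i].is_jump) {
			out[n++] = pkt->insns[i].il_op;
		}
	}
	/* Jumps are appended last. */
	for (size_t i = 0; i < pkt->count; i++) {
		if (pkt->insns[i].is_jump) {
			out[n++] = pkt->insns[i].il_op;
		}
	}
	return n;
}
```

Requesting the IL for any instruction but the last yields zero operations; requesting it for the last one yields the reordered operations of the whole packet.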

The RzIL code was entirely generated with the rzil-compiler, using the semantic definition of the QEMU Hexagon module.

Instructions currently compiled successfully (and tested):

[*] 1581/1733 standard instructions compiled.
[*] 431/643 HVX instructions compiled.
[*] In total: 2012/2376 instructions compiled.

It was tested with:

  • (Semantic tests) rz-tracetest against the execution trace of the QEMU Hexagon test binaries.
  • (Bug-free and semi-semantic tests) Added tests that simply execute the test binaries to ensure leak-free and segfault-free execution. Each binary is also executed until a certain instruction is reached (end of main or the loc.pass symbol), which partially tests that it executes correctly.

For the uplifting, several changes and modernizations had to be made:

  • Enhance consistency of decoding
    • Allow disassembling an instruction without copying the result. This is used when the given buffer of instruction bytes is larger than one instruction width: as many instructions as the buffer can hold are disassembled and buffered for later.
    • Generally enhance buffering of instructions.
    • Allow marking a packet as valid before it is completely decoded (in case we know it must be valid, e.g. if it is a jump target of a valid packet).
  • Fix (hopefully) all memory leaks of the Hexagon plugin.
  • Changes to register getters, because RzIL needs finer control to translate alias or explicit register names to their real register.
    • The register name getter is now table-based, so to distinguish between DSP versions in the future we can just select another table.
    • Translation functions from a register alias or explicit name to the real register.
    • Each operand now contains its variable ID (e.g. d for register Rd) as in the ISA (for mapping in the RzIL code).
  • Ease debugging by tracking more precisely whether an instruction is added to a stale, active or new packet.
  • Add registers C20 - C29 (not yet present in LLVM)
  • Some renaming to make the code more readable.
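
As a rough illustration of the table-based register getter and the alias translation listed above, a lookup could look like the sketch below. All names (`SketchRegName`, `sketch_int_regs`, `sketch_reg_name`, `sketch_reg_to_num`) are hypothetical and not the plugin's API, and only a handful of table entries are filled in. The aliases SP, FP and LR for R29, R30 and R31 follow the Hexagon ISA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: explicit register name plus optional alias. */
typedef struct {
	const char *name; /* explicit name, e.g. "R29" */
	const char *alias; /* alias, e.g. "SP", or NULL if there is none */
} SketchRegName;

/* One table per DSP version could be defined; only a few entries are filled here. */
const SketchRegName sketch_int_regs[] = {
	[0] = { "R0", NULL },
	[1] = { "R1", NULL },
	[29] = { "R29", "SP" },
	[30] = { "R30", "FP" },
	[31] = { "R31", "LR" },
};

/* Returns the name (or alias, if requested and present) of register `num`, or NULL. */
const char *sketch_reg_name(const SketchRegName *table, size_t len, unsigned num, bool use_alias) {
	if (!table || num >= len) {
		return NULL;
	}
	if (use_alias && table[num].alias) {
		return table[num].alias;
	}
	return table[num].name; /* NULL for entries left out of this sketch */
}

/* Translates an alias or explicit name to the real register number, or -1. */
int sketch_reg_to_num(const SketchRegName *table, size_t len, const char *str) {
	if (!table || !str) {
		return -1;
	}
	for (size_t i = 0; i < len; i++) {
		if (table[i].name && strcmp(table[i].name, str) == 0) {
			return (int)i;
		}
		if (table[i].alias && strcmp(table[i].alias, str) == 0) {
			return (int)i;
		}
	}
	return -1;
}
```

Passing the table as an argument is what would let a caller pick a different table for another DSP version without touching the lookup code.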

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository
  • I made sure to follow the project's coding style
  • I've documented or updated the documentation of every function and struct this PR changes. If not so I've explained why.
  • I've added tests that prove my fix is effective or that my feature works (if possible)
  • I've updated the rizin book with the relevant information (if needed)

Detailed description

Hexagon uplifting PR.

TODO after merge

  • rz-rzilcompiler
    • Transfer ownership to RizinOrg
    • Tag rz-rzilcompiler as v1.0
  • rz-hexagon
    • Change submodule to use RizinOrg repo
    • Use tag of rz-rzilcompiler
    • Refactor rz-hexagon to build as proper python package.
    • Tag rz-hexagon as v1.1

Test plan

Added, all green

@Rot127 Rot127 added this to the 0.7.0 milestone Sep 8, 2023
@github-actions github-actions bot added the API label Sep 11, 2023
@github-actions github-actions bot added the RZIL label Oct 1, 2023
@wargio wargio changed the title [Hexagon] RzIL uplifitng [Hexagon] RzIL uplifiting Oct 29, 2023
@github-actions github-actions bot added the RzIO label Nov 13, 2023
@Rot127
Member Author

Rot127 commented Nov 13, 2023

Opening this for the first round of review now.

The rz-test cases are still missing. Due to the inner workings of the RzIL generation it is not really possible to add the asm-style tests (I will check again to be sure). Instead, I would simply execute most test bins up to the address where they can be considered passed.

Currently the tests fail due to missing #3973

@Rot127 Rot127 marked this pull request as ready for review November 13, 2023 19:29
Member

@XVilka XVilka left a comment

The code looks great. Amazing job!

newest = i;
}
}
RZ_LOG_DEBUG("╭─────┬──────────────┬─────┬──────────────────┬───────────────╮\n");
Member

We should have general API for that, inside RzTable.

Member Author

@Rot127 Rot127 Nov 15, 2023

I couldn't do the last column that way; it is split into four, which IMHO makes it easier to read. Do you think RzTable is still better? Then I will change it. It is only debug printing, though.

Member

It was just a remark, not a requirement. But it would be nice to figure out what's missing in RzTable.

@Rot127
Member Author

Rot127 commented Nov 21, 2023

For the tests to succeed, we need rizinorg/rizin-testbins#127 merged, though.
Before that maybe look at https://github.com/rizinorg/rizin/pull/3837/files#diff-a1583f59ca1cc7b3c25cde35c24fade6df510112bcf21174d5d2995a5baa14e2

@Rot127
Member Author

Rot127 commented Nov 21, 2023

@thestr4ng3r For your information: 722e5c0

Comment on lines +843 to +847
if (i < HEXAGON_STATE_PKTS - 1) {
	RZ_LOG_DEBUG("├─────┼──────────────┼─────┼──────────────────┼───┼───┼───┼───┤\n");
} else {
	RZ_LOG_DEBUG("╰─────┴──────────────┴─────┴──────────────────┴───┴───┴───┴───╯\n");
}
Member

When RZ_LOG_DEBUG is not enabled, this is kind of useless. Maybe change it to:

RZ_LOG_DEBUG(i < HEXAGON_STATE_PKTS - 1 ?
	"├─────┼──────────────┼─────┼──────────────────┼───┼───┼───┼───┤\n" :
	"╰─────┴──────────────┴─────┴──────────────────┴───┴───┴───┴───╯\n");

Member Author

Yeah, right. Let me add an early return to the function.

This logging is mostly there because it nicely visualizes how the packets are filled over time, something I found really difficult to track just by looking at the variables in the debugger. It helps with finding some weird instruction buffering issues.

We can also remove it, if you feel it is unnecessary overhead.
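
The early return mentioned above could look roughly like this sketch. It does not use the Rizin logging API: `SketchLogLevel`, `SKETCH_STATE_PKTS` and `sketch_print_packet_state` are hypothetical, the packet count of 8 is an arbitrary assumption, and the table borders are simplified to ASCII.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical log levels; lower means more verbose. */
typedef enum {
	SKETCH_LOG_DEBUG = 0,
	SKETCH_LOG_INFO = 1
} SketchLogLevel;

#define SKETCH_STATE_PKTS 8 /* assumed number of buffered packets */

/*
 * Prints one row and one border line per buffered packet and returns the
 * number of lines written. Returns early, before any formatting work is
 * done, if debug output is not enabled.
 */
int sketch_print_packet_state(SketchLogLevel level, FILE *out) {
	assert(out);
	if (level > SKETCH_LOG_DEBUG) {
		return 0; /* debug logging disabled: skip the whole loop */
	}
	int lines = 0;
	for (int i = 0; i < SKETCH_STATE_PKTS; i++) {
		fprintf(out, "| packet %d |\n", i);
		lines++;
		/* Separator between rows, closing border after the last row. */
		fputs(i < SKETCH_STATE_PKTS - 1 ? "+----------+\n" : "'----------'\n", out);
		lines++;
	}
	return lines;
}
```

With the check at the top, the per-packet loop costs nothing when debug output is off, which addresses the overhead concern without removing the visualization.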

Member

@wargio wargio left a comment

mostly is ok. just a stupid detail for me.

Member

@XVilka XVilka left a comment

LGTM but it would be nice to extract non-Hexagon changes in a separate PR and merge it first.

@Rot127
Member Author

Rot127 commented Mar 21, 2024

Please let me run all rz-tracetest tests after #4373. Just to be sure.

Rot127 added 13 commits March 22, 2024 00:51
Add hexagon il code to source files.

Update generated files

Add undocumented ops file.

Fix build: Add missing effects.

Update generated files.

Degrade/remove very common log messages.

Rename: p -> pkt and k -> slot

Reverse renaming of k, since it is indeed the location in the packet and not the slot.

Add slot to HexInsn (needed for RzIL)

Update generated files.

Disassemble a full packet if it is in the buffer.

Allow to get the packet IL ops if they are ready.

Only copy disassembly result if requested.

Fix typo

Fix packet recognition of first and jump target packets.

Remove check for theoretically unreachable effects.

The assumptions made in this line don't need to be true.
For Hexagon every write to a register first happens to a temporary
register file. Once the whole packet is checked and is valid,
the operations are performed.
The last step is to sync the temporary register file with the default one.
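
The temporary register file described in this commit message can be pictured with the following minimal sketch. The names (`SketchRegs`, `sketch_packet_begin`, `sketch_read`, `sketch_write`, `sketch_packet_end`) are hypothetical and not taken from the plugin. It also assumes that reads inside a packet see the values from before the packet, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NUM_REGS 32

typedef struct {
	uint32_t regs[SKETCH_NUM_REGS]; /* default register file */
	uint32_t tmp[SKETCH_NUM_REGS]; /* temporary register file used during a packet */
} SketchRegs;

/* Start of a packet: the temporary file starts out as a copy of the default one. */
void sketch_packet_begin(SketchRegs *s) {
	assert(s);
	memcpy(s->tmp, s->regs, sizeof(s->regs));
}

/* Assumption: reads inside a packet see the values from before the packet. */
uint32_t sketch_read(const SketchRegs *s, unsigned r) {
	assert(s && r < SKETCH_NUM_REGS);
	return s->regs[r];
}

/* Every write goes to the temporary register file first. */
void sketch_write(SketchRegs *s, unsigned r, uint32_t v) {
	assert(s && r < SKETCH_NUM_REGS);
	s->tmp[r] = v;
}

/* End of a packet: sync the temporary file back only if the packet was valid. */
void sketch_packet_end(SketchRegs *s, bool valid) {
	assert(s);
	if (valid) {
		memcpy(s->regs, s->tmp, sizeof(s->regs));
	}
}
```

Under these assumptions, a packet that writes R1's value to R0 and R0's value to R1 swaps the two registers, because neither write is visible until the final sync.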

Update generated files (16 passed test_*)

Restructure generated rzil code

Update with optimization by resolving compares and ternaries

All test_* cases succeed

Update source files with new sub-routines

Remove pseudo instructions

Update generated files (all tests pass).

Update source after rebase

Fix: Check rw overlap only for x registers.

Add RzIL tests for hexagon.

Update generated files

Remove unit test for further effect after control effect.

Update generated files.

Fix unit test.

After instructions with invalid decoded registers are decoded as invalid,
the unit test failed.

Fix syntactical changes in tests.

Use upper case register names in CC

Fix many many mem leaks.

Only generate IL code if requested.

Enable load_align test.

Fix syntax: Remove trailing ; for invalid decodes with parse_bits == 0.

Add angle brackets type annotations.

Remove many resource leaks.

Note that RzAsm and RzAnalysis for Hexagon cannot be freed,
because there is currently no reasonable way to make the plugin thread safe.

Add bitmaps

Use RzBitmap instead of manually setting bits.

Fix type annotations.

Triple test timeout for ASAN tests.

Fix type annotation.

Use PVector instead of Vector to save the HexILOp pointer.

Ignore dwarf error for now.

Apparently, mixing allocated and static memory in a vector leaks the allocated memory. rip.

Replace bitmaps with bitvectors.

Ensure packet il stats reset.

Remove warning from test
It is dictated by BAP theory that the jumps come last.
@Rot127
Member Author

Rot127 commented Mar 22, 2024

Ok, all good now:

./run-test-sets.sh -i -t essentials,float
test_vspliceb [PASS]
test_vpmpyh [PASS]
test_vminh [PASS]
test_vmaxh [PASS]
test_vlsrw [PASS]
test_vcmpw [PASS]
test_vcmpb [PASS]
test_vavgw [PASS]
test_round [PASS]
test_reorder [PASS]
test_packet [PASS]
test_mpyi [PASS]
test_lsr [PASS]
test_jmp [PASS]
test_hwloops [PASS]
test_hl [PASS]
test_fibonacci [PASS]
test_ext [PASS]
test_dotnew [PASS]
test_cmp [PASS]
test_clobber [PASS]
test_call [PASS]
test_bitsplit [PASS]
test_bitcnt [PASS]
test_abs [PASS]
usr [PASS]
v68_scalar [PASS]
v73_scalar [PASS]
test-vma [PASS]
load_align [PASS]
multi_result [PASS]
overflow [PASS]
first [PASS]
mem_noshuf [PASS]
preg_alias [PASS]
dual_stores [PASS]
mem_noshuf_exception [PASS]
read_write_overlap [PASS]
reg_mut [PASS]
misc [PASS]
fpstuff [PASS]

@Rot127 Rot127 merged commit fb6efca into rizinorg:dev Mar 22, 2024
44 checks passed
@Rot127 Rot127 deleted the rzil-hexagon branch March 22, 2024 07:27
@Matheus-Garbelini

Hi, just a question: is the current dev branch of rizin using the latest changes from https://github.com/rizinorg/rz-hexagon and https://github.com/rizinorg/rz-rzilcompiler ?

@Rot127
Member Author

Rot127 commented Sep 9, 2024

@Matheus-Garbelini Yes. If you plan to do research on Hexagon firmware, you can ping me on our Mattermost as well.

Projects
Status: Done
4 participants